Automatically Generating Dockerfiles via Deep Learning: Challenges and Promises
Containerization allows developers to define the execution environment in
which their software needs to be installed. Docker is the leading platform in
this field, and developers who use it are required to write a Dockerfile for
their software. Writing Dockerfiles is far from trivial, especially when the
system has unusual requirements for its execution environment. Although several
tools exist to support developers in writing Dockerfiles, none of them can
generate entire Dockerfiles from scratch given a high-level specification of
the requirements of the execution environment. In this paper, we present a
study aimed at understanding to what extent Deep Learning (DL), which has
proven successful for other coding tasks, can be used for this specific
coding task. As a preliminary step, we defined a structured natural language
specification for Dockerfile requirements and a methodology to automatically
infer the requirements from the largest dataset of Dockerfiles
currently available. We used the obtained dataset, with 670,982 instances, to
train and test a Text-to-Text Transfer Transformer (T5) model, following the
current state-of-the-art procedure for coding tasks, to automatically generate
Dockerfiles from the structured specifications. The results of our evaluation
show that T5 performs similarly to the simpler IR-based baselines we
considered. We also report the open challenges associated with the application
of deep learning in the context of Dockerfile generation.
Studying the Usage of Text-To-Text Transfer Transformer to Support Code-Related Tasks
Deep learning (DL) techniques are attracting increasing attention in the
software engineering community. They have been used to support several
code-related tasks, such as automatic bug fixing and code comment generation.
Recent studies in the Natural Language Processing (NLP) field have shown that
the Text-To-Text Transfer Transformer (T5) architecture can achieve
state-of-the-art performance for a variety of NLP tasks. The basic idea behind
T5 is to first pre-train a model on a large and generic dataset using a
self-supervised task (e.g., filling masked words in sentences). Once the model
is pre-trained, it is fine-tuned on smaller and specialized datasets, each one
related to a specific task (e.g., language translation, sentence
classification). In this paper, we empirically investigate how the T5 model
performs when pre-trained and fine-tuned to support code-related tasks. We
pre-train a T5 model on a dataset composed of natural language English text and
source code. Then, we fine-tune this model by reusing datasets used in four
previous works that used DL techniques to: (i) fix bugs, (ii) inject code
mutants, (iii) generate assert statements, and (iv) generate code comments. We
compare the performance of this single model with the results reported in the
four original papers proposing DL-based solutions for those four tasks. We show
that our T5 model, exploiting additional data for the self-supervised
pre-training phase, can achieve performance improvements over the four
baselines.

Comment: Accepted to the 43rd International Conference on Software Engineering
(ICSE 2021).
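The self-supervised pre-training objective mentioned above (filling masked spans) can be sketched as follows. This is a minimal, simplified illustration, not the authors' implementation: real T5 pre-training samples spans randomly and operates on subword token ids, while here spans are passed explicitly and tokens are whole words. The `span_corrupt` helper is a hypothetical name; the `<extra_id_n>` sentinel format follows the T5 convention.

```python
# Minimal sketch of T5-style span corruption: replace chosen token
# spans with sentinel tokens in the input, and collect the removed
# spans (prefixed by their sentinels) as the target sequence.
# Simplified assumption: spans are given explicitly and are
# non-overlapping, sorted by position.

def span_corrupt(tokens, spans):
    """tokens: list of words; spans: list of (start, end) index pairs."""
    inp, tgt = [], []
    prev = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(tokens[prev:start])   # keep text before the span
        inp.append(sentinel)             # mask the span in the input
        tgt.append(sentinel)             # mark the span in the target
        tgt.extend(tokens[start:end])    # the model must predict these
        prev = end
    inp.extend(tokens[prev:])            # keep the trailing text
    return " ".join(inp), " ".join(tgt)

source = "public int add ( int a , int b )".split()
x, y = span_corrupt(source, [(1, 2), (4, 6)])
# x == "public <extra_id_0> add ( <extra_id_1> , int b )"
# y == "<extra_id_0> int <extra_id_1> int a"
```

During pre-training the model reads `x` and learns to emit `y`; the same text-to-text interface is then reused at fine-tuning time, with task-specific input/output pairs instead of corrupted spans.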
An Empirical Investigation on the Readability of Manual and Generated Test Cases
Software testing is one of the most crucial tasks in the typical development process. Developers are usually required to write unit test cases for the code they implement. Since this is a time-consuming task, in recent years many approaches and tools for automatic test case generation, such as EvoSuite, have been introduced. Nevertheless, developers have to maintain and evolve tests to keep pace with changes in the source code; therefore, having readable test cases is important to ease this process. However, it is still not clear whether developers make an effort to write readable unit tests. Therefore, in this paper, we conduct an exploratory study comparing the readability of manually written test cases with that of the classes they test. Moreover, we deepen this analysis by looking at the readability of automatically generated test cases. Our results suggest that developers tend to neglect the readability of test cases and that automatically generated test cases are generally even less readable than manually written ones.